Our knowledge of psychiatric and substance-use 
genetics comes from two key fields of research, both 
dynamic areas in rapid change. First, genetic epidemiol- 
ogy asks whether there is risk in excess of the population 
baseline in the relatives of cases, and, if so, whether the 
excess risk is attributable to the genetic factors or the 
environments they share. Beyond simply estimating her- 
itability, genetic epidemiology has evolved to address 
more sophisticated questions, such as whether liability 
genes have the same effects across the lifespan, how they 
may influence multiple disorders, and how they might 
interact with environmental risks. 
Genetic epidemiology of psychiatric and behavioral phe- 
notypes has consistently demonstrated that: i) genetic 
risk factors are, in aggregate, important etiological com- 
ponents; ii) they cannot completely account for observed 
risk, meaning these phenotypes are multifactorial traits, 
with important nongenetic (or environmental) con- 
tributing factors; and iii) the risk alleles appear to be of 
small effect size and to occur in a large number of genes. 
Psychiatric and behavioral phenotypes are influenced by 
a large number of risk factors that individually are within 
the range of normal human variation and produce mod- 
est individual increases in risk. 

Both genetic and nongenetic risk factors, as well as interactions 
and correlations between them, are thought to 
contribute to the etiology of psychiatric and behavioral 
phenotypes. Genetic epidemiology consistently supports 
the involvement of genes in liability. Molecular genetic 
studies have been less successful in identifying liability 
genes, but recent progress suggests that a number of specific 
genes contributing to risk have been identified. 
Collectively, the results are complex and inconsistent, with 
no single common DNA variant in any gene influencing 
risk across human populations. Few specific genetic variants 
influencing risk have been unambiguously identified. 
Contemporary approaches, however, hold great promise 
to further elucidate liability genes and variants, as well as 
their potential inter-relationships with each other and 
with the environment. We will review the fields of 
genetic epidemiology and molecular genetics, providing 
examples from the literature to illustrate the key concepts 
emerging from this work. 

The initial goal of the second major research area, molecular 
genetics, is to identify genes which influence these 
phenotypes and to identify the specific risk variants within 
them. There are substantial differences in DNA sequences 
between individuals, and gene identification methods test 
whether specific alleles at these variable positions are 
more common in affected than in unaffected individuals, 
most commonly with linkage studies (in families) and 
association studies (primarily in case/controls, but also in 
numerous other designs). We will discuss the underlying 
causes of these two genetic phenomena, the methods for 
detecting them, and the limitations of each. 
The second goal of molecular genetics is to identify spe- 
cific risk alleles and to use functional studies to elucidate 
how a gene functions normally, how the risk allele alters 
normal function, and how these alterations contribute to 
disease. The aim of this work is to explain the aggregate 
genetic risks observed through the effects of risk alleles 
on gene expression, protein structure and function, 
and/or biological processes. This area remains largely 
unsuccessful to date for complex traits generally. 
In this review we focus on the basic methods of genetic 
epidemiology and molecular genetics, and provide exam- 
ples, across a variety of psychiatric and substance use dis- 
orders, of questions currently being addressed. In con- 
trast to this first section on genetic epidemiology, the 
sections on molecular genetics focus narrowly on schiz- 
ophrenia, where there is a much longer history of mole- 
cular genetic studies, because we judged that emphasiz- 
ing a single disorder would provide a more coherent 
example of ongoing research progress and challenges. 

Basic genetic epidemiology 

The most fundamental question addressed by psychiatric 
genetic epidemiology is whether a particular trait or dis- 
order shows evidence for genetic influence. Both twin 
and adoption studies provide methods to address this 
question and tease apart the degree to which genetic and 
environmental influences are important on a given out- 
come. Twin studies accomplish this by comparisons of 
the similarity of monozygotic twins (MZs; who share 
100% of their genetic variation), with dizygotic twins 
(DZs; who share on average just 50% of their genetic 
variation). Adoption studies compare similarity among 
adopted-apart biological relatives, who share genetic 
variation, but not their environments, and adoptive rel- 
atives, who share their environment, but not their 
genetic makeup. Through these comparisons, we can 
quantify the degree to which genetic influences contribute 
to individual differences in risk, a statistic commonly 
referred to as the heritability of the trait. These 
study designs have been applied to virtually all psychi- 
atric disorders and to a number of related traits, yielding 
compelling evidence that genetic influences play a criti- 
cal role in virtually all psychiatric outcomes. There is 
considerable variability in the magnitude of genetic 
influence across different disorders. On the high end are 
disorders such as schizophrenia, bipolar disorder, and 
autism, which yield heritability estimates of the order of 
80% or higher. Alcohol and other drug dependence 
shows moderate heritability, in the range of 50% to 60%. 
On the lower end of the spectrum, though still showing 
significant evidence of genetic influence, are anxiety and 
depressive disorders, as well as eating disorders, which 
yield heritability estimates of ~30% to 40%. So, while 
there is variability in the magnitude of importance of 
genetic effects, it is widely accepted that a significant 
genetic component plays a role in virtually all psychiatric 
traits. It is a sign of the paradigm shift that has taken 
place in psychiatry that heritability estimates are no 
longer considered controversial, since the original stud- 
ies finding evidence for genetic effects represented 
strong challenges to predominant views favoring envi- 
ronmental theories on the causation of most psychiatric 
conditions, ranging from schizophrenia to autism to alco- 
hol dependence — disorders that are all now widely rec- 
ognized as having genetic components. 
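
The logic of the twin comparison described above can be made concrete with a rough calculation. The sketch below uses Falconer's classical approximation, which is not the structural equation modeling used in the studies cited here; it assumes that MZ pairs share all additive genetic variance, that DZ pairs share half, and that both types of pair share the family environment equally. The function name and the twin correlations are hypothetical and chosen only for illustration.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Rough variance-component estimates from twin-pair correlations.

    Assumption (classical twin model): MZ pairs share 100% and DZ pairs
    50% of additive genetic variance, with equal shared environments.
    """
    h2 = 2.0 * (r_mz - r_dz)   # additive genetic share (heritability)
    c2 = 2.0 * r_dz - r_mz     # shared (family) environment share
    e2 = 1.0 - r_mz            # unique environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, not taken from any study cited in this review.
estimates = falconer_estimates(r_mz=0.60, r_dz=0.35)
print(estimates)
```

With these made-up inputs, about half of the variance in liability would be attributed to genetic differences. The three components sum to one by construction, which is one reason the statistic invites the nature-versus-nurture reading discussed next.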
While demonstration of heritability played an important 
role in altering fundamental assumptions about the eti- 
ology of psychiatric disorders, if not understood in their 
proper context, heritability estimates can also have a 
number of unfortunate side effects. Firstly, the heritabil- 
ity statistic created a dichotomy of genetic versus envi- 
ronmental influence — nature versus nurture. How much 
is genetic? How much is environmental? This is, as we 
hope to show, a somewhat arbitrary distinction. Genetic 
predispositions by necessity are expressed in the context 
of the organism's environment, and the environment can 
differentially affect individuals based on their unique 
genetic makeup. Further, many environments are not 
simply "imposed" on an individual; rather, individuals 
play an active role in selecting and shaping their envi- 
ronments. Accordingly, it is generally more informative 
to elucidate pathways of risk and show how genetic and 
environmental influences come together in this process, 
rather than trying to divide influence into that which is 
genetic and that which is environmental. Secondly, 
demonstration of heritability led to the idea that there 
were genes "for" a given disorder. More complex mod- 
els that have examined genetic influences across multi- 
ple different conditions suggest that the Diagnostic and 
Statistical Manual of Mental Disorders (DSM) structure 
of psychiatric diagnoses often does not map onto the 
underlying genetic architecture of psychiatric traits. 
Genetic influences appear to be shared across many psy- 
chiatric conditions, and likely operate through mediat- 
ing characteristics that alter risk for a number of differ- 
ent outcomes. Finally, static heritability estimates fail to 
capture the dynamic nature of genetic and environmen- 
tal influences on psychiatric outcome. Heritability esti- 
mates are specific to the population under study. Lost in 
heritability estimates are potential differences across 
environmental conditions, across populations or gender, 
and across ages. Accordingly, genetic epidemiology has 
undergone an evolution in the kinds of questions being 
addressed. No longer is the question simply "Are genetic 
influences important on Trait X?" or even "How impor- 
tant are genetic influences on Trait X?". Rather, the 
focus has shifted to addressing the complexities raised 
here, using the paradigm we have called advanced 
genetic epidemiology. 

Advanced genetic epidemiology 

Moving beyond genes versus environment: 
gene-environment interaction and correlation 

Parsing genetic and environmental influences into sep- 
arate sources represents a necessary oversimplification, 
as for most traits we know about, genetic and environmental 
influences are inextricably intertwined. Most measures 
of the environment show some degree of genetic 
influence, illustrating the active role that individuals play 
in selecting and creating their social worlds. 1 To the 
extent that these choices are impacted upon by an indi- 
vidual's genetically influenced temperaments and behav- 
ioral characteristics, an individual's environment is not 
purely exogenous, but rather, in some sense, is in part an 
extension and reflection of the individual's genotype. 
This concept is called gene-environment correlation or, 
perhaps more descriptively, genetic control of exposure 
to the environment. It is likely an important process in 
the risk associated with several psychiatric outcomes. For 
example, there is considerable evidence for peer 
deviance being associated with adolescent substance use. 
However, individuals play an active role in selecting 
their friends, and multiple genetically informative samples 
have now demonstrated that a genetic predisposition 
toward substance use is associated with the selection 
of other friends who use substances. 2-4 Interestingly, 
there is evidence that genetic effects on peer-group 
deviance show a strong and steady increase across devel- 
opment, 5 suggesting that as individuals get older and 
have increasing opportunities to select and create their 
own social environment, genetic factors assume increas- 
ing importance. Another area where gene-environment 
correlation is known to play a significant role is in the 
risk pathways associated with depression. Stressful life 
events have been consistently associated with the man- 
ifestation of depression. However, there is evidence for 
genetic influence on the occurrence of stressful life 
events, 6,7 indicating that an individual's predisposition 
plays a role in the likelihood that they will experience 
difficulties that are then associated with risk for depres- 
sive episodes. For example, research has shown that a 
genetic liability to major depression increases the risk 
for a range of stressful life events, particularly those 
reflecting interpersonal and romantic difficulties. 8 These 
represent only a couple of areas where individuals are 
known to play an active role in shaping environmental 
factors that are associated with subsequent risk for psy- 
chiatric problems. 

Another way that genetic and environmental influences 
are linked is via gene-environment interaction or, as we 
might prefer, genetic control of sensitivity to the environment. 
In these situations, genetic influences may vary 
in importance as a function of environmental conditions, 
and/or the environment may differ in importance as a 
function of an individual's genetic predisposition (these 
two conceptualizations of gene-environment interaction 
are indistinguishable statistically). Heritability estimates 
essentially average across environments; accordingly, if 
there is reason to believe that the importance of genetic 
effects might vary as a function of the environment, this 
information can be incorporated into the twin model to 
test for significant differences in heritability as a func- 
tion of the environment. Substance use provides one 
area where gene-environment interaction effects have 
been found to be particularly important. Environments 
that exert more social control and present less opportu- 
nity to engage in substance use consistently show 
reduced evidence for the importance of genetic effects. 
In this sense, the environment is essentially constraining 
the expression of a predisposition toward substance 
use/problems. This has been demonstrated with respect 
to enhanced parental monitoring in adolescents, 9 a more 
religious upbringing, 10 and enhanced community stabil- 
ity, 11 among other factors. One nice example of this can 
be found in an analysis of the heritability of adolescent 
smoking across the United States using data from the 
National Longitudinal Study of Adolescent Health. 
Genetic influences on daily smoking were lower in states 
with relatively high taxes on cigarettes and in those with 
greater controls on vending machines and cigarette 
advertising, again suggesting the importance of social 
control mechanisms in moderating the importance of 
genetic influences on substance use. 12 
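
One way to picture heritability that varies with the environment is to let the genetic path coefficient depend on a measured moderator. The sketch below is a simplified, hypothetical form of the moderated variance-component models used in twin research, not the model fitted in any study cited above. The parameter names (a, b_a, c, e, m) and all numerical values are assumptions made for illustration, with m standing for the level of social control.

```python
def moderated_heritability(a: float, b_a: float, c: float, e: float, m: float) -> float:
    """Heritability at moderator value m.

    Assumption: the additive genetic path is a + b_a * m, while the
    shared (c) and unique (e) environmental paths do not change with m.
    """
    var_a = (a + b_a * m) ** 2
    total = var_a + c ** 2 + e ** 2
    return var_a / total


# Hypothetical coefficients; m = 0 is low and m = 1 is high social control.
for m in (0.0, 0.5, 1.0):
    print(m, round(moderated_heritability(a=0.8, b_a=-0.4, c=0.4, e=0.5, m=m), 3))
```

A negative b_a makes the genetic share shrink as the moderator rises, which is the pattern reported for parental monitoring, religious upbringing, and cigarette taxation.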

Delineating phenotypic boundaries of genetic risk 

The rationale of the basic twin design can be expanded 
to examine the extent to which genetic and environ- 
mental factors contribute to the co-occurrence of psy- 
chiatric conditions. Comorbidity among psychiatric dis- 
orders is common, and multivariate twin studies have 
helped address the etiological mechanisms that con- 
tribute to these observed epidemiological patterns. A 
fascinating result to emerge from these studies is that 
psychiatric conditions with distinct clinical presentations 
(eg, major depression and anxiety) are not necessarily 
distinct genetically. For example, a study of major 
depression and generalized anxiety disorder found a 
genetic correlation of 1.0, suggesting that the same 
genetic influences impact depression and anxiety, but 
differences in environmental experiences contribute to 
the manifestation of different outcomes. 13 An expanded 
study that examined the genetic and environmental 
architecture across seven common psychiatric and sub- 
stance-use disorders found that genetic influences load 
broadly onto two factors that map onto internalizing dis- 
orders (depression, anxiety disorders), and externalizing 
disorders (alcohol and other drug dependence, child- 
hood conduct problems, and adult antisocial behavior). 14 
These findings indicate that while distinguishing these 
disorders as "separate conditions" in the DSM may be 
useful for clinical purposes, these categories do not nec- 
essarily reflect differences in biological etiology. These 
findings, along with similar results from phenotypic 
analyses (eg, refs 15,16) have led some to suggest a reor- 
ganization of the "metastructure" of psychiatric disor- 
ders in DSM-V. 
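
To show what a genetic correlation measures, the following sketch derives one from twin correlations using the same Falconer-style approximation as the univariate case. It is an illustration under the classical twin-model assumptions, not the multivariate structural equation modeling used in the cited studies. The cross-twin cross-trait correlations (trait 1 in one twin with trait 2 in the co-twin) and all other inputs are hypothetical.

```python
import math


def genetic_correlation(r_mz_1: float, r_dz_1: float,
                        r_mz_2: float, r_dz_2: float,
                        ctct_mz: float, ctct_dz: float) -> float:
    """Approximate genetic correlation between two standardized traits.

    ctct_mz and ctct_dz are cross-twin cross-trait correlations in MZ
    and DZ pairs. Assumes additive genetic effects only.
    """
    h2_1 = 2.0 * (r_mz_1 - r_dz_1)        # heritability of trait 1
    h2_2 = 2.0 * (r_mz_2 - r_dz_2)        # heritability of trait 2
    cov_a = 2.0 * (ctct_mz - ctct_dz)     # genetic covariance between traits
    return cov_a / math.sqrt(h2_1 * h2_2)


# Hypothetical inputs chosen only to illustrate the arithmetic.
r_g = genetic_correlation(r_mz_1=0.40, r_dz_1=0.20,
                          r_mz_2=0.30, r_dz_2=0.15,
                          ctct_mz=0.30, ctct_dz=0.15)
print(round(r_g, 3))
```

A value near 1.0, as reported for major depression and generalized anxiety disorder, means that the genetic covariance is as large as the two heritabilities allow.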



Another area of investigation examines whether there 
are differences in the importance of genetic and envi- 
ronmental factors at different stages of the disorder. For 
example, the development of substance dependence is 
necessarily preceded by several stages, including the ini- 
tiation of the substance, the progression to regular use, 
and the subsequent development of problems, whether 
they be psychological, social, and/or physiological. Twin 
studies can investigate the degree to which each of these 
steps in the pathway of risk is influenced by genetic 
and/or environmental factors, and the extent to which 
the same or different genetic/environmental factors 
impact different stages. For example, data from two pop- 
ulation-based, longitudinal Finnish twin studies found 
that shared environmental factors played a large role in 
initiation of alcohol use, and a more moderate role on 
frequency of use, and it was largely the same influences 
acting across these stages of use. However, there was no 
significant evidence of shared environmental influences 
on alcohol problems in early adulthood. Problems were 
largely influenced by genetic factors that overlapped 
with genetic influences on frequency of use. 17 In a study 
from Virginia in male twins, similar results were found 
for alcohol, cannabis, and nicotine. 18 In the early years of 
adolescence, shared environmental influences were 
responsible for nearly all twin resemblance for levels of 
intake of these psychoactive substances. However, as 
individuals aged, the impact of shared environment 
decreased and that of genetic factors increased. 
Finally, there is known to be tremendous heterogeneity 
among individuals with psychiatric conditions. Twin stud- 
ies can provide insight into whether clinical hetero- 
geneity may reflect differences in etiological risk factors. 
For example, alcohol dependence with comorbid drug 
dependence has been found to be a particularly heritable 
form of the disorder, 19,20 and twin studies have suggested 
a genetic influence on typical versus atypical 
forms of major depression. 21 

Changing genetic influence across development 

Another active area of research is the clarification of 
how genetic and environmental influences may change 
across development. A recent meta-analysis examined 
published studies with at least two heritability time 
points across adolescence and young adulthood for eight 
different behavioral domains. These analyses revealed 
significant cross-time heritability increases for externalizing 
behaviors, anxiety symptoms, depressive symptoms, 
IQ, and social attitudes, and nonsignificant increases for 
alcohol consumption and nicotine initiation. The only 
domain that showed no evidence of heritability changes 
across time was attention-deficit/hyperactivity disorder. 22 
Similarly, in a large study of >11 000 pairs of twins from 
four countries, the heritability of general cognitive abil- 
ity was found to increase significantly and linearly from 
41% in childhood (9 years) to 55% in adolescence (12 
years) and to 66% in young adulthood (17 years). 23 The 
robust finding of increases in the importance of genetic 
influences across development likely reflects, in part, 
active gene-environment correlation, as individuals 
increasingly select and create their own experiences 
based on their genetic propensities. 
In addition to changes in the relative magnitude of 
importance of genetic and environmental influences, 
another dynamic change is that different genes may be 
acting at different time points. This is nicely illustrated 
in recent analyses of alcohol use problems, as assessed 
at five time points from ages 19 to 28 in the Dutch Twin 
Registry (Kendler et al, in preparation). Kendler and 
colleagues found strong innovation and attenuation of 
genetic factors across this age range — indicating that 
some genetic influences on alcohol problems that were 
evident at age 19 declined in importance across time, 
while new genetic influences became important starting 
at ages 21 and 23. Thus, although the overall heritability 
of alcohol problems remained fairly stable, it appeared 
that different genetic factors were important at different 
timepoints. In analyses in the TCHAD Swedish study 
which followed twins from ages 9 to 20 across four waves 
of assessment, large changes were seen in the genetic 
risk factors for fears and phobias 24 and for symptoms of 
anxiety and depression, 25 with particularly pronounced 
evidence for genetic innovation at puberty. These analyses 
suggest that genetic influences on many psychiatric 
and substance use disorders are likely to be developmentally 
dynamic. 

Sex differences 

Sex differences in the prevalence of psychiatric disor- 
ders, and in risk and protective factors associated with 
psychiatric outcomes, are widespread in epidemiology. 
Twin studies allow us to investigate the extent to which 
there are differences in the relative importance of 
genetic and environmental influences on outcome, and 
the extent to which different genes and/or environments 
may be important. Large-scale twin studies have sug- 
gested, for example, that the genetic risk factors for both 
depression 26 and alcohol dependence, 27 while correlated, 
are not entirely the same for males and females. Results 
from two large twin studies in the US and Sweden agree 
that the genetic influences on major depression are modestly 
stronger in women than in men. 26 - 28 

Do we still need twin studies in the era of 
gene finding? 

As advances in molecular genetics and statistical analy- 
sis have made it possible to conduct large-scale projects 
aimed at identifying the specific genes involved in sus- 
ceptibility to psychiatric outcome (detailed in the next 
sections), some have raised questions about the contin- 
uing utility of genetic epidemiology. The argument is that 
heritability has now been established, which provides the 
foundation and justification for moving beyond twin 
studies, on to large-scale gene identification projects. 
However, as detailed in this paper, most twin studies are 
no longer conducted simply to test for the presence of 
genetic effects; rather, they focus on the more complex 
kinds of questions summarized above. These analyses 
are not only informative about the nature of etiological 
pathways of risk, but they can also be used to guide gene 
identification efforts and to further our understanding 
of the risk associated with specific genes as they are 
identified. 

Currently, gene-finding efforts for psychiatric disorders 
(and other common, complex medical conditions) have 
met with limited success. Findings from genetic epi- 
demiology can be used to inform the phenotypes used in 
gene-finding studies. For example, based on the twin lit- 
erature (reviewed above) suggesting that much of the 
predisposition to alcohol dependence is via a broad 
externalizing factor, externalizing factor scores were cre- 
ated in the Collaborative Study on the Genetics of 
Alcoholism (COGA) sample, comprised of symptoms of 
alcohol and other drug dependence, and childhood and 
adult antisocial behavior, as well as the personality traits 
of novelty-seeking and sensation-seeking, which also 
index general behavioral disinhibition. This latent exter- 
nalizing factor score was then used in both linkage and 
association analyses, with results compared with analyz- 
ing separately the individual symptoms of each of the 
psychiatric disorders that went into the creation of the 
general externalizing score. 29 The results demonstrated 
that this broader externalizing phenotype was useful in 
both linkage and association analyses, suggesting that 
creating phenotypes grounded in the twin literature can 
aid in identifying susceptibility genes. Twin data have also 
been used to aid in genetic association studies in the area 
of internalizing disorders. Using data from the Virginia 
Adult Twin Study of Psychiatric and Substance Use 
Disorders, multivariate structural equation modeling was 
used to identify common genetic risk factors for major 
depression, generalized anxiety disorder, panic disorder, 
agoraphobia, social phobia, and neuroticism. Cases and 
controls were then identified for genetic association 
studies based on scoring at the extremes of the genetic 
factor extracted from the twin analysis, with the subse- 
quent association analyses yielding evidence for associ- 
ation with the gene GAD1. 30 

Another area where genetic epidemiology intersects 
with gene identification efforts is in the characterization 
of risk associated with identified genes. Most major 
gene identification efforts for psychiatric disorders cur- 
rently focus on adult psychiatric outcomes. As we iden- 
tify genes that are reliably associated with these disor- 
ders, one of the next interesting research challenges will 
be to study how risk associated with these genes unfolds 
across development and in conjunction with the envi- 
ronment. Here, findings from genetic epidemiology can 
again be useful in developing hypotheses to test the risk 
associated with specific genes. For example, based on 
the twin literature suggesting that adult alcohol depen- 
dence and childhood externalizing symptoms overlap in 
large part due to a shared genetic predisposition, 31 genes 
that were originally identified as associated with adult 
alcohol dependence (eg, GABRA2, 32 CHRM2 33 ) have 
been tested for association with externalizing behavior 
in younger samples of children and adolescents. These 
studies suggest that children carrying the genetic vari- 
ants associated with alcohol problems later in life dis- 
play elevated rates of conduct problems earlier in devel- 
opment, before any association with alcohol 
dependence has manifested. 34-36 Further, based on the 
twin literatures suggesting that genetic influences on 
externalizing behaviors are moderated by parental 
monitoring 9 and peer deviance, 37,38 further analyses 
demonstrated that the associations between these genes 
and externalizing behavior were stronger under condi- 
tions of lower parental monitoring and higher peer 
deviance. Characterizing the risk pathways associated 
with identified genes will be critical in eventually translating 
this information into improved prevention and 
intervention programs. 

Gene identification methods 

The field of psychiatric genetics has used two different 
methods to attempt to identify individual risk genes: 
linkage and association. These are fundamentally differ- 
ent approaches with different study designs applied, until 
recently, to very different research questions. It is impor- 
tant to understand both in order to understand why 
association approaches have become the norm in follow- 
up studies of linkage regions as well as the primary cur- 
rent approach in genome-wide studies. 

DNA polymorphisms 

Humans are ~99.9% identical at the nucleotide level on 
average. Molecular genetic studies depend critically on 
the remaining 0.1% (~3 million nucleotides) where vari- 
ation occurs between individuals, collectively known as 
genetic polymorphisms or markers. Linkage studies gen- 
erally use short tandem repeat polymorphisms (STRs). 
STR alleles are differing numbers of a repeating unit of 
nucleotides and have specific sequence lengths and mol- 
ecular weights as a result, allowing them to be separated 
and identified. STRs are very common and tend to be 
extremely polymorphic (ie, to have many alleles — where 
an allele is one of the possible variants that exist in a 
population at a particular genetic locus) and therefore 
to have high heterozygosity (the proportion of individ- 
uals who have two different alleles at the marker locus). 
This high heterozygosity is important for linkage analy- 
ses, which require a unique allele at each position on 
each homologous chromosome to be informative. 
In contrast, single nucleotide polymorphisms (SNPs) are 
changes of a single base or insertion/deletion variation 
up to a few nucleotides in size. SNPs generally have only 
two alleles, and have lower heterozygosity and lower 
information content. Association studies tend to use 
SNPs as the marker of choice, because alleles of these 
markers evolve more slowly than those of STRs and pre- 
serve more of the evolutionary relationships on which 
genetic association is based. SNPs can also be used for 
linkage, but about ten times as many SNPs as STRs are 
required to capture the linkage information. 
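
The difference in information content between the two marker types can be illustrated by computing expected heterozygosity from allele frequencies. The sketch below assumes Hardy-Weinberg proportions (random mating), under which the expected proportion of heterozygotes is one minus the sum of squared allele frequencies. The function name and the allele frequencies are hypothetical.

```python
def expected_heterozygosity(allele_freqs: list) -> float:
    """Expected proportion of heterozygotes at a locus.

    Assumption: Hardy-Weinberg proportions, so H = 1 - sum(p_i ** 2).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical markers: a two-allele SNP and a ten-allele STR,
# each with equally frequent alleles (the most informative case).
snp_h = expected_heterozygosity([0.5, 0.5])
str_h = expected_heterozygosity([0.1] * 10)
print(round(snp_h, 2), round(str_h, 2))
```

A marker with two alleles cannot exceed a heterozygosity of 0.5, whereas a marker with many alleles can approach 1, which is why more SNPs than STRs are needed to extract the same linkage information.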

Linkage 

In marker genotype data from families, new combina- 
tions of alleles at a series of markers on individual chro- 
mosomes are observed in each generation. This recom- 
bination of alleles is observed because there is at least 
one physical exchange of material (or crossover) 
between each homologous chromosome pair in every 
meiosis (Figure 1). Recombination between loci on dif- 
ferent chromosomes (because of independent assort- 
ment of homologous chromosome pairs) or far apart on 
the same chromosome (because of crossover at meiosis) 
is observed 50% of the time. Linkage is observed 
between loci in close proximity on a chromosome 
because their alleles are separated by crossover less than 
50% of the time. 

Mendelian diseases are caused by mutations in a single 
gene at a single chromosomal location, so disease phe- 
notypes can be treated as marker alleles in linkage 
analysis. Because these illnesses are rare, for a dominant 
disorder, the rare risk allele must segregate from one 
parent (often affected or with family history) into 
affected offspring, or arise as an even rarer de novo 
mutation. By following the segregation of marker alle- 
les from the affected lineage into offspring, linkage 
between markers and phenotypes can be observed when 
affected offspring inherit a particular set of marker alle- 
les (and thus a specific parental chromosomal segment) 
compared with their unaffected relatives. 
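
Evidence for linkage is conventionally summarized as a LOD score, a statistic not described in the text above but standard in the field. It compares the likelihood of the observed recombinants at a proposed recombination fraction with the likelihood under free recombination (50%). The sketch below covers only the simplest case, in which every meiosis is fully informative and phase is known; the function name and the counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known, fully informative meioses.

    theta is the proposed recombination fraction (0 < theta <= 0.5);
    the null hypothesis of no linkage corresponds to theta = 0.5.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_linked = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_unlinked = n * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# Hypothetical family data: 1 recombinant observed in 10 meioses.
lod = lod_score(recombinants=1, meioses=10, theta=0.1)
print(round(lod, 2))
```

A LOD of 3 or more has traditionally been taken as significant evidence for linkage in Mendelian disorders, so even this apparently striking hypothetical family would need to be combined with others.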

Association 

While linkage occurs in families, association is a popu- 
lation-based phenomenon. Genetic association studies 
test whether specific alleles at variable sites are more 
common in individuals affected by a disease (cases) than 
individuals not affected by the disease (controls). This 
association between allele and phenotype can occur for 
two reasons. Either the allele being studied directly influ- 
ences risk for the disorder or, more commonly, the allele 
is in linkage disequilibrium (LD) with the disease-pre- 
disposing allele. Linkage disequilibrium means that spe- 
cific alleles at two nearby loci tend to occur together in 
a population more often than expected by chance. Linkage (the cosegregation of a 
chromosome region and a disease observed in families) 
occurs at scales of tens of millions of base pairs because 
of the limited number of recombinations observed in 
each generation of a family. Association (and LD) are 
seen at scales of thousands to tens of thousands of base 
pairs, because the number of recombinations present in 
the evolutionary history of a population is large, mean- 
ing that the physical distances between loci in LD must 
be correspondingly small if recombination is to occur 
rarely (if ever) between them. 
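
The basic case-control comparison described above can be sketched as a Pearson chi-square test on a 2x2 table of allele counts. This is a minimal illustration that treats the two alleles carried by each person as independent observations and ignores population stratification; it is not the analysis used in any particular study. The function name and the counts are hypothetical.

```python
def allelic_chi_square(case_risk: int, case_other: int,
                       control_risk: int, control_other: int) -> float:
    """Pearson chi-square (1 degree of freedom) on allele counts."""
    n = case_risk + case_other + control_risk + control_other
    denominator = ((case_risk + case_other) * (control_risk + control_other)
                   * (case_risk + control_risk) * (case_other + control_other))
    if denominator == 0:
        raise ValueError("every row and column must contain observations")
    cross = case_risk * control_other - case_other * control_risk
    return n * cross ** 2 / denominator


# Hypothetical counts: 500 cases and 500 controls, two alleles each.
chi2 = allelic_chi_square(case_risk=240, case_other=760,
                          control_risk=200, control_other=800)
odds_ratio = (240 * 800) / (760 * 200)
print(round(chi2, 2), round(odds_ratio, 2))
```

In this made-up example the statistic exceeds 3.84, the conventional 5% threshold for one degree of freedom, but it would fall far short of the thresholds needed once many markers are tested.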

LD occurs because a new allele always arises on a spe- 
cific background chromosome (and its existing haplo- 
type of marker alleles), and will, until separated by 
recombination, only exist in conjunction with the other 
alleles present on that background. Over time, the orig- 
inal LD (and thus the genetic association) between more 
distant loci decays as a result of recombination events, 
while the rarity of recombination between nearby loci 
preserves the original LD and association. Association 
can also be detected spuriously, eg, if observed differ- 
ences in allele frequency are due to population differ- 
ences rather than to true association between marker 
and phenotype. Association approaches are also sub- 
stantially reduced in power in the presence of allelic het- 
erogeneity (the existence of more than one risk allele at 
a locus), while this phenomenon has no effect on the 
detection of linkage. 
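
The decay of LD through recombination follows a standard population-genetic recursion: if r is the recombination fraction between two loci, the LD coefficient after t generations is expected to be D_t = D_0 (1 - r)^t. The sketch below applies that formula under stated assumptions: a rough average of 1 cM per Mb (so loci about 10 kb apart have r of about 0.0001 and loci about 10 Mb apart have r of roughly 0.1, ignoring map-function corrections) and a hypothetical allele age of 1000 generations.

```python
import math


def ld_remaining(d0, r, generations):
    """Expected LD coefficient after t generations: D_t = D_0 * (1 - r)**t."""
    return d0 * (1.0 - r) ** generations


def ld_half_life(r):
    """Generations needed for LD to fall to half of its starting value."""
    return math.log(0.5) / math.log(1.0 - r)


GENERATIONS = 1000  # hypothetical age of the risk allele, in generations

for label, r in [("~10 kb apart", 1e-4), ("~10 Mb apart", 0.1)]:
    frac = ld_remaining(1.0, r, GENERATIONS)
    print(
        f"{label}: fraction of original LD left = {frac:.3g}, "
        f"half-life = {ld_half_life(r):.0f} generations"
    )
```

Under these assumptions most of the original LD survives between the nearby loci and essentially none survives between the distant ones, which is why association operates at much finer physical scales than linkage.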

Challenges associated with gene identification in
psychiatric and substance-use disorders

A number of features of psychiatric and behavioral phe- 
notypes contribute to an overall reduction in study 
power. Association is generally more powerful for
detecting genes of small effect, 39 but the specific features
of psychiatric and behavioral phenotypes also reduce the 
power of association studies. 

First, psychiatric phenotypes are almost certainly influ- 
enced by multiple common alleles of small effect in many 
genes. Both linkage and association study designs are 
more powerful for alleles of large effect size, and are 
much less powerful when examining highly polygenic 
phenotypes. Replication studies are hampered by the 
need for sample sizes larger than the discovery sample 
(in order to maintain power) and stochastic sampling 
variation, the expected variation in the extent to which 
any specific risk factor is present (and association 
detectable) in any particular sample. 
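
The dependence of power on effect size can be made concrete with the standard normal-approximation sample-size formula for comparing two proportions, applied here to allele frequencies in cases and controls. This is a rough sketch only: the control allele frequency of 0.30, the odds ratios, and the function names are hypothetical, and the formula ignores genotype-level models, LD between marker and risk allele, and heterogeneity.

```python
from statistics import NormalDist


def case_allele_freq(ctrl_freq, odds_ratio):
    """Allele frequency in cases implied by a control frequency and an allelic odds ratio."""
    odds = odds_ratio * ctrl_freq / (1.0 - ctrl_freq)
    return odds / (1.0 + odds)


def alleles_needed_per_group(ctrl_freq, odds_ratio, alpha=0.05, power=0.8):
    """Approximate number of alleles per group for a two-sided test of two proportions."""
    p0 = ctrl_freq
    p1 = case_allele_freq(ctrl_freq, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2.0
    numerator = (
        z_alpha * (2.0 * p_bar * (1.0 - p_bar)) ** 0.5
        + z_beta * (p0 * (1.0 - p0) + p1 * (1.0 - p1)) ** 0.5
    ) ** 2
    return numerator / (p1 - p0) ** 2


for odds_ratio in (2.0, 1.2):
    n_alleles = alleles_needed_per_group(0.30, odds_ratio)
    # Each individual contributes two alleles.
    print(f"OR {odds_ratio}: about {n_alleles / 2:.0f} individuals per group")
```

The required sample grows steeply as the odds ratio approaches 1, and grows further when `alpha` is lowered to correct for many tests.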

Second, interactions between genes (GxG) or between 
genes and environmental variables (GxE) seem necessary
to account for observed risks, but we rely heavily on
analytic approaches that assess single genes. In a few cases,
genes with known molecular interactions with the can- 
didates have also generated replicated association. 
Environmental risk factors remain largely unknown and 
are difficult or very expensive to test in many samples. 

Third, these phenotypes are common, so the liability alle- 
les seem likely to be common, although increased rates of 
rare deletions and duplications (structural or copy num- 
ber variants) in cases have been observed multiple times 
and suggest that rare variation may also contribute to 
risk in a proportion of cases. The common risk variants 
are expected to occur with relatively high frequency in 
the general population, reducing contrast between 
affected and unaffected individuals and reducing power. 
The impact of individual rare structural variants in the 
subset of cases where they are observed is harder to 
assess currently, but the observation of an aggregate 
increase appears robust, further increasing the apparent 
etiological complexity. 

Fourth, the expected frequency of risk alleles and the 
clinical variability in presentation, course, and outcome 
suggest that the etiology of individual cases may be het- 
erogeneous, derived from different specific genes or alle- 
les between individuals. Allelic heterogeneity substan- 
tially reduces the power of association designs. 

Fifth, diagnostic boundaries are difficult to draw, and the 
best phenotype to study is a complex choice. It is criti- 
cally important to consider this last point and the phe- 
notypes that yield the strongest evidence in some detail. 

An example: schizophrenia 
gene identification 

Through 2004, 25 complete or nearly complete genome 
scans for schizophrenia (in which about 400 individual 
genetic markers are genotyped at regular intervals over 
the entire human genome) were published (for review 
see refs 40,41). None provided evidence for genes of 
major effect. Some linkage regions were replicated in 
these studies, and a number of promising genes emerged 
from sequential linkage and association studies and
multiple replication reports. We focus here on those regions
with the best replication record and with evidence
emerging from other contemporary studies: 22q12-q13,
8p22-p21, 6p24-p22, and 1q32-42. Two additional regions
with little support in the primary literature, 2p11.1-q21.1
and 3p25.3-p22.1, were among the most significant in a
meta-analysis of schizophrenia genome scans. A number
of other regions (including 5q22-q31 and 15q13-q14)
have less strong summary evidence but also overlap with 
evidence from more recent GWAS and structural varia- 
tion studies. 

Chromosome 22q, the VCFS microdeletion, 
and COMT 

Chromosome 22q has been widely studied using many 
different designs. Primary linkage signals were observed 
in a few samples but have generally been widely repli- 
cated. However, the cosegregation of a known 
microdeletion in the region with a phenotype in which 
psychosis is a common feature added significantly to 
interest in this region. Velo-cardio-facial syndrome 
(VCFS) is caused by two overlapping, recurrent
deletions at 22q11. Historically, about 10% of VCFS patients
were thought to present with a psychotic phenotype, but
more recent studies suggest much higher rates of 25%
to 29%. 42,43 Conversely, preliminary results suggest that
about 2% of adult onset and 6% of childhood onset 
schizophrenic patients have microdeletions in this 
region, in excess of the estimated general population fre- 
quency of such deletions of 0.025%. 44 Interest in this 
region has been further increased recently by studies 
assessing structural variation (see below). The gene for 
catechol-O-methyl transferase (COMT), involved in the 
degradation of catecholamines, maps to this region; the 
enzyme is functionally polymorphic with a variable 
amino acid, Val158Met, affecting activity. Although
COMT has been widely studied, the results of genetic
studies are inconclusive, as reviewed recently. 45

Chromosome 8p22-p21, NRG1, and ERBB4 

Studies of pedigrees from numerous different ethnic 
backgrounds have detected linkage to schizophrenia on 
8p, as did a statistically robust meta-analysis. 46 Although 
numerous samples support a locus on 8p, comparison 
between individual studies is consistent with the pres- 
ence of multiple susceptibility genes, a feature of a num- 
ber of linkage regions. Almost certainly the most impor- 
tant result on 8p so far is the widely replicated 
association with the neuregulin 1 (NRG1) gene in families
and case/controls from Iceland. 47 NRG1 is a large
gene with multiple transcripts yielding distinct protein 
molecules. It is expressed at central nervous system 
synapses and is involved in the expression and activation 
of neurotransmitter (including glutamate) receptors. 
Initial replication studies 48,49 detected association on
haplotypes identical or closely related to those identified in
the Icelandic cases; 13 additional studies in multiple
populations reported association with more variation in
associated alleles or haplotypes, 50-62 while nine studies did
not. 63-71 A meta-analysis of studies of NRG1 supported
involvement of the gene in schizophrenia liability, but
did not provide evidence supporting association of the
most prominent marker in the original studies. 72 In a
pattern observed for a number of the best supported
schizophrenia genes, several studies have also shown
association between NRG1 and bipolar disorder. 62,73,74
ErbB4, encoded by the ERBB4 gene, is a receptor for 
NRG1 and has important roles in neurodevelopment 
and the modulation of NMDA receptor functioning. 
Both activation of ErbB4 and suppression of NMDA 
receptor activation by NRG1 are increased in the pre- 
frontal cortex in individuals with schizophrenia com- 
pared with controls. 75 This functional relationship 
prompted genetic study of ERBB4, which demonstrated 
association in ERBB4 and evidence of interaction with 
NRG1. 59 Associated alleles in ERBB4 alter splice-
variant expression, 79 and both NRG1 and ErbB4 protein
are increased in the brain in schizophrenia. These results 
may be of particular importance as there is a biologically 
plausible mechanism for gene x gene interactions, and 
even if the interaction is not confirmed, both genes 
impact the glutamatergic system (supporting the widely 
held view that part of the complexity may be explained 
by effects at the level of the pathway or system). 
Important tests of both interaction and system effects 
unbiased by candidate selection will be undertaken in 
the current GWAS datasets. 

Chromosome 6p24-p22, DTNBP1, and the 
HLA region 

Chromosome 6 has a long history in genetic studies of 
schizophrenia with major shifts in the apparent impor- 
tance of particular results. Early linkage studies observed 
evidence of linkage in human leukocyte antigen (HLA) 
genes in the major histocompatibility complex (MHC) 
region on chromosome 6p21.3-22.1, but the limited
genome coverage (only ~6%) and lack of replication
reduced the apparent importance of these findings. The 
first strong evidence for linkage of schizophrenia to the 
6p region came from studies of Irish families with a high 
density of disease. 80 This study was also important 
because it addressed the question of diagnostic bound- 
aries in some detail. Evidence for linkage was modest 
under a narrow diagnostic model, increased substantially 
as the diagnostic definition broadened to include psy- 
chosis spectrum disorders, and fell when the definition 
was broadened further to include nonspectrum disor- 
ders, in keeping with observed risks in relatives for these 
traits. Multiple independent studies of this region of 6p 
observed evidence for linkage, as did a multicenter col- 
laborative study 81 and a robust meta-analysis. 46 
The dystrobrevin binding protein 1 or dysbindin 
(DTNBP1) gene was first reported to be associated in 
the same Irish families. 82,83 Many studies support
association in DTNBP1 in samples from diverse ethnic
backgrounds, although the markers, alleles, and haplotypes
associated vary significantly from study to study: 13
studies of 15 independent samples reported significant
positive association with schizophrenia (most consistently
with common alleles and the highest frequency common
allele haplotype), 70,82-93 while 14 studies of 18 independent
samples did not. 61,63,85,94-104 A further four studies have also
provided positive evidence for association of DTNBP1
with bipolar disorder. 105-108 Although the function of
DTNBP1 in brain is unknown, both RNA 109 and pro- 
tein 110 expression is reduced in cases. 

Chromosome 1q and DISC1

Interest in chromosome 1 in schizophrenia began with 
reports of a balanced 1:11 translocation segregating with 
serious mental illness in a large pedigree from 
Scotland. 111 The chromosome 1 breakpoint lies at 1q42.1,
and the breakpoint directly disrupts a novel gene,
Disrupted in Schizophrenia 1 (DISC1). 112 There are now
nine positive reports of association of DISC1 with
schizophrenia 74,113-120 and two of association with positive
symptoms, 121,122 suggesting that this gene influences
schizophrenia liability in the general population, as well as in
the family with the chromosomal anomaly. Other rare
variants in this gene besides the breakpoint have also
been reported to be associated with schizophrenia, 123,124
and association has been reported for additional
psychiatric diagnoses (reviewed in ref 125) and for bipolar
disorder. 126 A smaller number of negative reports have
also been published. 103,127-130

Other chromosomal regions and genes 

Two additional chromosome regions show some overlap
with the results of current studies discussed below:
5q22-q31, where association was recently reported in the
interleukin-3 (IL3) gene, 131 and 15q13-q14, where evidence
for linkage of an evoked potential abnormality common in
patients 132 was supported by five additional studies
reporting linkage of schizophrenia to the same narrow
region. 133-137 Other high-profile candidate genes such as
PRODH2 on 22q 138 and PPP3CC on 8p 139 have not
replicated well. One exception is AKT1, 140 which has similar
numbers of positive 141-145 and negative 61,103,146-149
replications.

Genome-wide association studies 

By assaying 500 000 to 1 000 000 DNA variants in a sin- 
gle experiment, GWAS provide unbiased genome-wide 
coverage, avoiding selection of candidate genes. They use 
an association framework for analysis, avoiding the 
weaknesses of linkage in complex traits. They impose 
stringent criteria due to the number of tests performed 
(typically around P < 5 x 10^-8 for genome-wide
significance). They hold enormous potential to move beyond
the identification of single genes (which may show small 
effects and be difficult to detect individually) toward the 
simultaneous identification of multiple genes through 
their interactions or involvement in systems. 
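
A genome-wide significance threshold of this magnitude is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate for roughly one million independent tests. That rationale is offered here as an assumption, not as a statement from the studies cited, and the function names below are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(per_test_alpha, n_tests):
    """Expected number of null tests passing an uncorrected per-test threshold."""
    return per_test_alpha * n_tests


N_TESTS = 1_000_000  # assumed number of independent variants tested

print("corrected threshold:", bonferroni_threshold(0.05, N_TESTS))
print("false positives expected at P < 0.05:", expected_false_positives(0.05, N_TESTS))
```

The contrast between the two numbers shows why nominal significance is uninformative at this scale.
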
Seven GWAS of schizophrenia have been published to 
date, four of which were small and underpowered. The 
first (320 cases, 325 controls) was of limited density as it 
genotyped only 25 000 SNPs in 14 000 known genes, and 
did not detect any association that reached genome-wide 
significance 150 ; nominal association was reported in the 
plexin A2 (PLXNA2) gene. Only one of four samples
tested in three independent studies replicated the
association. 151-153 The second (extremely underpowered with
178 cases, 144 controls) identified one genome-wide sig- 
nificant association in the X/Y pseudoautosomal region 
(a homologous region of the sex chromosomes where 
recombination can occur), near the interleukin 3 recep- 
tor (IL3R) gene. 154 Cytokines have been suggested as 
possible candidates previously and IL3 (in the 5q
linkage region) was associated with schizophrenia in one
study. 131 One replication attempt supported association
in IL3R. 155 The third, using the CATIE 156 sample (738
cases, 733 controls), did not detect any genome-wide
significant results in its primary analysis. 157 The fourth, using
a multistage design of discovery (479 cases, 2937 con- 
trols) and targeted replication (6666 cases, 9897 controls) 
samples, identified one genome-wide significant SNP in 
the zinc-finger protein transcription factor ZNF804A 
gene, 158 but only in the meta-analysis including the orig- 
inal sample. One independent replication attempt sup- 
ported the association of ZNF804A, and showed that 
expression was increased from the associated haplo- 
type. 159 

Three substantially larger GWAS of schizophrenia were 
published in 2009, in the SGENE+ sample 160 (multiple 
European sites, 2663 cases/13498 controls), the 
International Schizophrenia Consortium (ISC) sample 161 
(multiple European sites, 3322 cases/3587 controls) and 
the Molecular Genetics of Schizophrenia (MGS) sam- 
ple 162 (multiple US sites, European ancestry: 2681 
cases/2653 controls; African ancestry: 1286 cases/973 
controls), analyzed both separately and together. The 
one region of the genome with significant overlap in sig- 
nals from the 3 studies was the MHC region on chro- 
mosome 6p21.3-p22.1, site of some of the earliest genetic 
evidence in schizophrenia discussed above. The 
SGENE+ sample detected significant association with 
several markers spanning the MHC region, as well as 
signals upstream of the neurogranin (NRGN) gene on
11q24.2 and in intron four of the transcription factor 4
(TCF4) gene on 18q21.2. The ISC sample detected
association in ~450 SNPs spanning the MHC region and the
myosin XVIIIB (MYO18B) gene on 22q and supported
ZNF804A. The MGS sample did not detect any individual
genome-wide significant signals, but detected signals with
P values in the range of 10^-5 to 10^-7 in the CENTG2 gene (reported
deleted in autism cases 163 ) on chromosome 2q37.2 and 
JARID2 (the gene adjacent to DTNBP1) in European- 
ancestry subjects, and in ERBB4 and NRG1 in African- 
American subjects. 

Meta-analysis of data from all European-ancestry MGS, 
ISC and SGENE samples detected genome-wide signif- 
icant association signals for 7 SNPs spanning 209 Kb of 
the MHC region. LD is high between the 7 SNPs and 
extends over a region of 1.5 Mb on chromosome 6p22.1, 
making it difficult to determine if the signal is driven by 
one or many genes. The genic content of this region is not
limited to histocompatibility loci, and also includes genes
involved in transcriptional regulation, DNA repair,
chromatin structure, G-protein-coupled-receptor signaling
and the nuclear pore complex. 

Meta-analyses of schizophrenia linkage and 
association data 

The strongest linkage meta-analysis approach ranks 30 cM 
bins of the genome from most positive to least positive for 
each study, and then sums the ranks for each bin. 
Significance levels are calculated by simulation, and this 
method can identify regions of the genome where modest 
positive results occur across many studies. Results of this 
approach supported linkage to chromosomes 6p and 8p 
among the previously identified regions discussed above.
The strongest evidence for a potential locus was on
chromosome 2p11.1-q21.1, a region suggested by only a few
studies and not widely followed up, and on 3p, the site of 
an early linkage finding that could never be replicated. 
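
The rank-summing procedure can be sketched in a few lines. This is a simplified illustration with hypothetical linkage scores for six bins in three studies. It ignores ties and study weighting, and the simulation gives only a pointwise P value per bin, under the assumption that in the absence of linkage each study's rank for a bin is uniform on 1..n.

```python
import random


def rank_bins(scores):
    """Rank bins within one study; the most positive bin gets the highest rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def summed_ranks(studies):
    """Sum each bin's within-study rank across all studies."""
    per_study = [rank_bins(scores) for scores in studies]
    return [sum(column) for column in zip(*per_study)]


def simulate_p_values(studies, n_sim=2000, seed=0):
    """Pointwise P value per bin: share of simulated null sums >= the observed sum."""
    rng = random.Random(seed)
    observed = summed_ranks(studies)
    n_bins = len(observed)
    n_studies = len(studies)
    # Null distribution of one bin's summed rank: one uniform rank per study.
    null = [
        sum(rng.randint(1, n_bins) for _ in range(n_studies))
        for _ in range(n_sim)
    ]
    return [sum(1 for x in null if x >= obs) / n_sim for obs in observed]


# Hypothetical linkage scores: rows are studies, columns are genome bins.
studies = [
    [0.1, 0.5, 1.8, 0.3, 2.0, 0.2],
    [0.4, 0.2, 1.5, 1.9, 0.1, 0.3],
    [1.7, 0.3, 1.6, 0.2, 0.4, 0.1],
]
print(summed_ranks(studies))
print(simulate_p_values(studies))
```

In this toy data the third bin is never the top-ranked bin in any single study, yet it has the largest summed rank, which is the kind of consistent modest signal the method is designed to detect.
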
A recent effort has been made to systematize the collec- 
tion and archiving of association data from studies of schiz- 
ophrenia, and to provide a framework for continuous 
updating of both the data and the meta-analytic results 164 
in the SzGene database (http://www.szgene.org/). Meta- 
analyses of the data contained in this resource provided 
support of varying degrees for 24 SNPs in 16 previously 
reported genes, including older candidate genes (eg, the
dopamine receptor 2 [DRD2] gene), those resulting from
association-based follow-up of linkage data (eg, DTNBP1),
and one suggested by one of the smaller GWAS
(PLXNA2). Meta-analyses of schizophrenia GWAS data
from at least 15 000 cases and 15 000 controls are sched- 
uled for completion in 2010. 

Rare structural variation in schizophrenia 

The epidemiological and genetic data above seem most
consistent with the common disease/common variant
hypothesis of the genetic risks for complex traits, and the
results of GWAS in other complex traits like type 2
diabetes provided a major validation of this model. 165-168 The
alternative common disease/rare variant hypothesis of 
genetic risks for complex traits has been proposed in 
schizophrenia, 169 largely based on the reduction in fertil- 
ity observed in cases. A key focus of research in this area 
has been the deletions, duplications, and inversions of a 
few thousand (Kb) to a few million (Mb) base pairs
collectively known as structural variants, an area of intense
research interest generally since 2004, 170-172 reviewed in
ref 173. As a class, these genomic rearrangements are
common: ~360 Mb or 12% of the genome is included in
structural variation. 174 A few such variants occur at high
frequency due to apparent selection in certain
contexts, 175,176 but studies of large samples consistently show
that the majority of structural variants are rare (~50%
detected in only one individual). 174
The aggregate rate of such rare structural variants is sig- 
nificantly increased in individuals with schizophrenia in 
all four studies that have examined this question. 177-180
Critically, there is substantial overlap in the regions
where excess structural variation is observed, most
notably on chromosomes 22q11, 15q13.3, and 1q21.1,
with some evidence that neurodevelopmental genes are
overrepresented, 181 and more recently on 16p11.2. 182
However, even considered in aggregate, structural vari- 
ants are observed in only 15% of schizophrenia cases, 
and so cannot account for a substantial fraction of the 
total population risk. Because they are rare, the true 
impact of individual structural variants on schizophre- 
nia is difficult to validate and interpret, although the 
replication of excess structural variation in cases on 
chromosomes 22q11, 15q13.3, and 1q21.1 is extremely
encouraging. 

Summary of current gene-finding studies 

At both the technical/molecular and statistical/concep- 
tual levels, the science of gene discovery in complex dis- 
ease genetics is moving rapidly. By the time this paper is 
published, new developments are sure to have arisen. As 
is common in science in a state of rapid flux, the
direction ahead is far from clear. How will the modest but
hard-fought advances obtained in more traditional posi- 
tional cloning and candidate gene work integrate with 
the new findings from GWAS? How will the common- 
variant SNP-based approach inter-relate with the emerg- 
ing rare-variant copy number variant findings? Will 
advances in phenotypic assessment or endophenotypes 
provide critical new insights? How will the burgeoning 
fields of bioinformatics, expression arrays, and pro- 
teomics impact on our gene-finding efforts? 
One emerging consensus is that the field needs to move 
from a "gene-centric" approach toward one that consid- 
ers "gene networks." For example, many of the candidate 
genes discussed above are involved in glutamatergic 
neurotransmission, which may be an important systemic
element in the etiology of schizophrenia. Although a
detailed discussion of this theory is outside the scope of 
this summary, recent reviews of the genetic 183 and neu- 
roscience 184 data and evidence from other studies high- 
light the positions of the gene products of NRG1, 
COMT, and possibly DTNBP1 among others, in the bio- 
chemical and functional pathways influencing the gluta- 
matergic system. Many other possible networks may be 
involved in the etiology of schizophrenia that, if prop- 
erly articulated, could aid in our gene-discovery efforts. 

Conclusion 

We have attempted in this article to review the rapidly 
evolving field of psychiatric genetics. In the section on 
genetic epidemiology, we took a conceptual approach 
focusing on a range of the most interesting questions 
now being confronted by the field, with the goal of giv- 
ing the reader a "feel" for the issues. While examining 
a wide range of disorders, we focused on substance use 
and externalizing disorders because they clearly illus- 
trated the points we wanted to make. In the section on 
gene-finding, we decided it would be more useful to 
"drill down" and illustrate our important themes by 
focusing on one disorder — schizophrenia. 
The major theme that cuts across these two sections is 
the complexity of the pathways from genetic variation 
to psychiatric and substance use disorders. Results of the 
last 20 years have shown that the early, simple
hypothesis of large-effect genes that directly cause
psychiatric illness was seriously mistaken. We now know
that multiple gene variants (as well as — for at least some 
disorders — genomic rearrangements) are involved at the 
DNA level. These genetic risk factors then act and inter- 
act with each other and with the environment in a com- 
plex developmental "dance" to produce individuals at 
high versus low risk of illness. It is this kind of
complexity that the field is now confronting directly.

Nature and nurture in neuropsychiatric genetics:
where do we stand?

Both genetic and nongenetic risk factors, as well as
interactions and correlations between them, are thought
to contribute to the etiology of psychiatric and behavioral
phenotypes. Genetic epidemiology consistently confirms
the involvement of genes in liability. Molecular genetic
studies have been less successful in identifying liability
genes, but recent progress suggests that a number of
specific genes contributing to risk have been identified.
Taken together, the results are complex and inconsistent
when a single common DNA variant in any one gene
influencing risk across human populations is considered.
Few specific genetic variants influencing risk have been
unambiguously identified. Current approaches, however,
hold promise for elucidating further liability genes and
variants, as well as their potential inter-relationships
with each other and with the environment. The fields of
molecular genetics and genetic epidemiology are
reviewed, with examples from the literature to illustrate
the key concepts emerging from this work.



Nature and nurture in neuropsychiatric genetics:
where do we stand?

Genetic and nongenetic risk factors, together with their
interactions and mutual correlations, are thought to
contribute to the etiology of psychiatric and behavioral
phenotypes. The involvement of susceptibility genes is
regularly confirmed by genetic epidemiology. Molecular
genetic studies have been less successful in identifying
susceptibility genes, but recent progress suggests that
several specific genes contributing to risk have been
identified. Taken collectively, the results are complex and
inconsistent with a single DNA variant in one gene
influencing risk across human populations. Few specific
genetic variants influencing risk have been unambiguously
identified. Current approaches are nevertheless very
promising for the future identification of susceptibility
genes and their variants, and of their possible
interrelationships with each other and with the
environment. In this review, we analyze the fields of
genetic epidemiology and molecular genetics, with
examples from the literature illustrating the key ideas of
our work.